Network Traffic Analysis: Hadoop Pig vs Typical MapReduce

Author

  • Anjali P P
Abstract

Big data analysis has become very popular, and the manipulation of big data has gained the keen attention of researchers in the field of data analytics. Analysis of big data is now considered an integral part of many computational and statistical departments. As a result, novel approaches to data analysis are evolving on a daily basis. Thousands of transaction requests are handled and processed every day by websites associated with e-commerce, e-banking, e-shopping carts, etc. Network traffic and weblog analysis play a crucial role in such situations, where Hadoop can be suggested as an efficient solution for processing the NetFlow data collected from switches, as well as website access logs, at fixed intervals.

Similar resources

A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pig and Typical Mapreduce

Big data analysis has now become an integral part of many computational and statistical departments. Analysis of petabyte-scale data is of growing importance today. Big data manipulation is now considered a key area of research in the field of data analytics, and novel techniques are evolving day by day. Thousands of transaction requests are being proces...

Phurti: Application and Network-aware Flow Scheduling for Mapreduce

Traffic for a typical MapReduce job in a datacenter consists of multiple network flows. Traditionally, network resources have been allocated to optimize network-level metrics such as flow completion time or throughput. Some recent schemes propose using application-aware scheduling which can reduce the average job completion time. However, most of them treat the core network as a black box with ...

Maximizing Data Locality in Hadoop Clusters via Controlled Reduce Task Scheduling

The overall goal of this project is to gain hands-on experience working on a large, open-ended, research-oriented project using the Hadoop framework. Hadoop is an open source implementation of MapReduce and the Google File System, and is currently enjoying wide popularity. Students will modify the task scheduler of Hadoop, conduct several experimental studies, and analyze performance and netwo...

From SPARQL to MapReduce: The Journey Using a Nested TripleGroup Algebra

MapReduce-based data processing platforms offer a promising approach for cost-effective and Web-scale processing of Semantic Web data. However, one major challenge is that this computational paradigm leads to high I/O and communication costs when processing tasks with several join operations typical in SPARQL queries. The goal of this demonstration is to show how a system RAPID+, an extension o...

Publication date: 2013